[X86] Re-enable DA8W4 path on X86 CPU#4033

Draft
Xia-Weiwen wants to merge 4 commits into pytorch:main from Xia-Weiwen:fix_da8w4

Conversation

@Xia-Weiwen
Collaborator

Summary
This PR re-enables the DA8W4 path on X86 CPU via Int8DynamicActInt4WeightOpaqueTensorConfig and Int4OpaqueTensor, and updates the unit tests in test/quantization/test_da8w4_cpu.py.

Test plan
python test/quantization/test_da8w4_cpu.py

@Xia-Weiwen added the "module: not user facing" label (use this tag if you don't want this PR to show up in release notes) on Mar 10, 2026
@pytorch-bot

pytorch-bot bot commented Mar 10, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/4033

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (2 Unrelated Failures)

As of commit f58c16f with merge base 3d02561:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla bot added the "CLA Signed" label (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) on Mar 10, 2026
@Xia-Weiwen Xia-Weiwen requested a review from Copilot March 10, 2026 05:17
Contributor

Copilot AI left a comment


Pull request overview

This PR re-enables the DA8W4 (dynamic int8 activation + int4 weight) CPU path on x86 by extending Int4OpaqueTensor to support DA8W4 packing/execution and adding a new quantization config and unit tests for the workflow.

Changes:

  • Add DA8W4 weight quantization + prepack (from_hp_da8w4) and a DA8W4 aten.linear dispatch path in Int4OpaqueTensor.
  • Introduce Int8DynamicActInt4WeightOpaqueTensorConfig and its module transform to apply DA8W4 quantization using Int4OpaqueTensor.
  • Restore/expand DA8W4 CPU unit tests (test/quantization/test_da8w4_cpu.py) and export the new config in the package __init__.
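As background on the "packing/prepack" step, int4 weights are typically stored two values per byte. The following is a generic, self-contained sketch of nibble packing, for illustration only; the actual Int4OpaqueTensor layout is CPU-kernel-specific and opaque, so none of these helpers appear in the PR itself.

```python
def pack_int4(vals):
    """Pack signed int4 values (range [-8, 7]) two per byte, low nibble first.
    Generic illustration only; not torchao's actual opaque layout."""
    assert len(vals) % 2 == 0
    u = [v & 0xF for v in vals]  # two's-complement nibbles
    return bytes(u[i] | (u[i + 1] << 4) for i in range(0, len(u), 2))

def unpack_int4(packed):
    """Inverse of pack_int4: split each byte into two sign-extended int4s."""
    out = []
    for b in packed:
        for nib in (b & 0xF, b >> 4):
            out.append(nib - 16 if nib >= 8 else nib)  # sign-extend
    return out

w = [-8, -1, 0, 7]
assert unpack_int4(pack_int4(w)) == w  # lossless round trip
```

Real prepack kernels additionally reorder values into a hardware-friendly tile layout, which is what makes the resulting tensor "opaque" to ordinary indexing.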

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

File Description
torchao/prototype/int4_opaque_tensor/int4_opaque_tensor.py Adds DA8W4 weight quantize+prepack and dynamic-activation linear implementation using da8w4_linear_cpu.
torchao/prototype/int4_opaque_tensor/inference_workflow.py Adds a new DA8W4 config + quantize-module handler for Int4OpaqueTensor.
torchao/prototype/int4_opaque_tensor/init.py Exposes the new DA8W4 config in the public prototype package API.
test/quantization/test_da8w4_cpu.py Adds DA8W4 CPU tests validating compilation/codegen and basic accuracy.


Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.




    # DA8W4 path: dynamic int8 activation + int4 weight
    if weight_tensor.act_mapping_type is not None:
        if weight_tensor.act_mapping_type == MappingType.SYMMETRIC:

Copilot AI Mar 10, 2026


act_mapping_type is stored on Int4OpaqueTensor as a string ("symmetric"/"asymmetric"), but this dispatch compares it to MappingType.SYMMETRIC. That condition will never be true, so the symmetric-version gate here is ineffective and the code is inconsistent with _da8w4_linear (which checks the string). Make the representation consistent (e.g., store MappingType in the tensor attribute and compare against MappingType.*, or keep it as a string and compare against "symmetric").

Suggested change:

    -        if weight_tensor.act_mapping_type == MappingType.SYMMETRIC:
    +        if weight_tensor.act_mapping_type == "symmetric":

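Copilot's point generalizes: a plain enum.Enum member never compares equal to its value string, so such a gate is silently dead code. A minimal stdlib sketch (this MappingType is a stand-in for illustration; torchao's real enum definition may differ):

```python
from enum import Enum

class MappingType(Enum):  # stand-in, not torchao's actual class
    SYMMETRIC = "symmetric"
    ASYMMETRIC = "asymmetric"

# A plain Enum member never equals its value string...
assert (MappingType.SYMMETRIC == "symmetric") is False

# ...so if the attribute is stored as the string "symmetric", a gate that
# compares it against MappingType.SYMMETRIC can never fire.
act_mapping_type = "symmetric"
assert (act_mapping_type == MappingType.SYMMETRIC) is False

# Consistent fixes: compare strings to strings, or normalize to the enum.
assert act_mapping_type == MappingType.SYMMETRIC.value
assert MappingType("symmetric") is MappingType.SYMMETRIC
```

Either representation works as long as both the storage site and every dispatch check agree on it.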
Comment on lines +124 to +130

    if config.set_inductor_config:
        torchao.quantization.utils.recommended_inductor_config_setter()

    assert hasattr(module, "weight"), (
        "applying DA8W4 quant requires module to have weight attribute"
        + f" but {module} does not have one"
    )

Copilot AI Mar 10, 2026


The DA8W4 module transform quantizes/prepacks weights unconditionally. If the DA8W4 CPU kernels aren't built/registered (or if running on an older PyTorch that doesn't support the needed path), this will still replace module.weight with an Int4OpaqueTensor, and the first linear() call will then fail at runtime. Consider adding an early guard here (similar to the unit test) that checks kernel availability via torch._C._dispatch_dump("torchao::da8w4_linear_cpu") and a torch_version_at_least("2.7.0") check (2.8.0 for the symmetric path) before applying the transform; otherwise log and return the original module.
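The guard the review suggests could look like the sketch below. The op name comes from the comment above; the injected dispatch_dump parameter (standing in for torch._C._dispatch_dump) is a hypothetical seam added here purely so the check can be illustrated without torch installed.

```python
def da8w4_kernel_available(dispatch_dump, op="torchao::da8w4_linear_cpu"):
    """Return True if the custom op appears registered with the dispatcher.

    `dispatch_dump` is injected (normally torch._C._dispatch_dump) so the
    guard logic can be exercised with a stub. A real guard would also add
    the PyTorch version checks the review mentions.
    """
    try:
        return bool(dispatch_dump(op))
    except Exception:  # op not registered / dispatcher query failed
        return False

# Exercising the guard with a stub standing in for the dispatcher dump:
registered = {"torchao::da8w4_linear_cpu": "CPU: registered ..."}
stub = lambda name: registered[name]
print(da8w4_kernel_available(stub))                            # True
print(da8w4_kernel_available(stub, op="torchao::missing_op"))  # False
```

With such a predicate, the transform can log and return the original module untouched instead of installing a weight that will fail on first use.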
